40 research outputs found

    Efficient similarity-based operations for data integration

    Similarity-based operations, similarity join, similarity grouping, data integration. Magdeburg, Univ., Faculty of Computer Science, dissertation, 2004, by Eike Schalleh

    Cloud-Scale Entity Resolution: Current State and Open Challenges

    Entity resolution (ER) is the process of identifying records in information systems that refer to the same real-world entity. Because data volumes have grown so large over the past two decades, parallel techniques are needed to meet ER's requirements for high performance and scalability. Parallel ER has reached a relatively mature stage of development and has found its way into several applications. In this work, we first comprehensively survey the state of the art in parallel ER approaches. From this overview, we extract classification criteria for parallel ER, then classify and compare the approaches against these criteria. Finally, we identify open research questions and challenges and discuss potential solutions and further research directions in this field.

    Advanced grouping and aggregation for data integration

    THE LOSTART.DE PROJECT: AN INTERNET DATABASE FOR LOST CULTURAL PROPERTY

    The search for cultural property that was looted or lost as a result of the Second World War and National Socialism remains a pressing task today, not only for art historians but also for affected private individuals, institutions and, of course, policymakers. This contribution presents the "Lost Art" project, within which a web database was developed to support this search. The database holds a wide range of information on the registered cultural property and offers different search and navigation modes in several languages. Starting from the architecture of this system, the contribution describes aspects of the implementation, the search options, and the data exchange between the public web database and the actual internal database.

    The Monetary Value of Information: A Leakage-Resistant Data Valuation

    Abstract: The importance of information as a main asset of a company or organization is widely acknowledged nowadays. The loss of, or unauthorized access to, sensitive information is critical and can possibly send a company into bankruptcy. Furthermore, the risk of information larceny is most often caused not by a direct attack from unauthorized outsiders, but by authorized extractions by malicious or unaware insiders who pass data to unauthorized outsiders. Unfortunately, this problem cannot be solved by the role-based authentication that is typically used. The detection of malicious accesses based on typical access characteristics, which has inspired some research, is limited in its potential. Therefore, we present a conceptual approach based on the valuation of information, i.e., using a description of the actual worth of data items within database systems. This makes it possible to rate potential losses on the fly and to prevent valuable extractions by insiders. In detail, we describe a mechanism called leakage-resistant data valuation that calculates a monetary value for every query and takes appropriate action if the cumulated monetary value exceeds a threshold (per query or per time span).
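
The threshold check described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `LeakageGuard`, the per-item values, the two limits, and the choice of "block the query" as the action are all assumptions made for illustration, since the abstract only states that a monetary value is computed per query and compared against a threshold per query or per time span.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LeakageGuard:
    """Hypothetical guard: blocks queries whose monetary value is too high."""
    per_query_limit: float    # assumed threshold for a single query
    per_window_limit: float   # assumed threshold for the cumulated value
    window_seconds: float     # assumed length of the time span
    _history: deque = field(default_factory=deque)  # (timestamp, value) pairs

    def check(self, item_values, now):
        """Return True if the query may proceed, False if it is blocked.

        item_values: assumed monetary values of the data items returned.
        now: current time in seconds (passed in to keep the sketch testable).
        """
        query_value = sum(item_values)
        if query_value > self.per_query_limit:
            return False
        # Forget queries that have fallen out of the time span.
        while self._history and now - self._history[0][0] >= self.window_seconds:
            self._history.popleft()
        window_total = sum(v for _, v in self._history) + query_value
        if window_total > self.per_window_limit:
            return False
        # Only queries that were allowed count towards the cumulated value.
        self._history.append((now, query_value))
        return True
```

With limits of 100 per query and 150 per 60-second span, a query worth 80 passes, a following query worth 90 within the same span is blocked because the cumulated value would reach 170, and the same query passes again once the first one has aged out of the span.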